Monte-Carlo (MC) methods, based on random updates and the trial-and-error
principle, are well suited to retrieving particle size distributions from small-angle
scattering patterns of dilute solutions of scatterers. The size sensitivity of size
determination methods in relation to the range of scattering vectors covered by the
data is discussed. Improvements are presented to existing MC methods in which
the particle shape is assumed to be known. The problems with the ambiguous
convergence criteria of the MC methods are discussed, and a convergence
criterion is proposed that also allows the determination of uncertainties on the
determined size distributions.

The search for generally applicable methods capable of determining structural
parameters from small-angle scattering patterns for a broad range of samples has
yielded several viable methods. For monodisperse systems consisting of identical
particles, these methods attempt to find a free-form solution to the pair-distance
distribution function p(r), whereas for polydisperse systems the aim is to determine
the distribution P(R) in real space. In both cases, the relevant transformation should
yield the observed scattering pattern (Krauthauser et al., 1996).

Among these are Indirect Transform Methods (ITM) based on regularization
techniques, which require the solution to be as smooth as possible (Glatter, 1977;
Glatter, 1979; Moore, 1980; Svergun, 1991; Pedersen, 1994), and Bayesian and
maximum-entropy ITM methods, which find the most likely solution using a Bayesian
approach and entropy maximization, respectively (Hansen, 2000; Hansen & Pedersen,
1991). Methods based on Titchmarsh transforms are also available for determining
size distributions (Mulato & Chambouleyron, 1996; Fedorova & Schmidt, 1978).

Another class of methods, such as the Structure Interference Method (SIM)
(Krauthauser et al., 1996) and some Monte-Carlo (MC) methods (Martelli & Di Nunzio,
2002; Di Nunzio et al., 2004), assume a particular shape and do not appear to require
smoothness constraints. These only have a positivity constraint and have so far been
limited to size distributions of sphere-shaped scatterers. These methods can be used
to extract the particle size distribution function of systems of (dilute¹) scatterers
whose shape is known or assumed (Krauthauser et al., 1996; Martelli & Di Nunzio,
2002; Di Nunzio et al., 2004). The MC variant approaches the optimization by trial
and error, whereas the SIM uses a conjugate gradient approach (Krauthauser, 1994).
Both are conceptually easier than the ITM or the methods based on Titchmarsh
transforms, and provide stable and unique solutions (Martelli & Di Nunzio, 2002;
Di Nunzio et al., 2004; Krauthauser et al., 1996).

Upon implementation of one such method by Martelli & 
Di Nunzio (2002), hereafter referred to as "The Martelli 
method", several noteworthy changes were made in the present 
work. We give a brief summary of the working principle, and 
highlight the differences from the Martelli method. Then, a 
general solution for detection limits is derived for particles in 
a polydisperse set. This aids the MC method as it allows for 
improved contribution scaling during the optimization proce- 
dure and indicates detectability limits in the final result. Lastly, 
a convergence criterion is defined for the MC method, allowing 
for the calculation of uncertainties in the resulting size distribu- 
tion. This method is applied to scattering data obtained during 
the synthesis of AlOOH nanoparticles. 

2. A brief overview of the implemented method 

Step 1: Preparation of the procedure 

The initial guess of the scattering intensity is calculated for a uniform random
distribution of a number of spheres $n_s$ anywhere between the size bounds
$0 < R_{\mathrm{sph}} < \pi/q_{\min}$ (where $q_{\min}$ is the smallest measured value
of $q = (4\pi/\lambda)\sin\theta$, with $\lambda$ the wavelength of the radiation and
$2\theta$ the scattering angle), using:

$$ I_{\mathrm{MC}}(q) = \sum_{k=1}^{n_s} \left| F_{\mathrm{sph},k}(qR_k) \right|^2 R_k^{(6-p_c)} \, p^*(R_k) \tag{1} $$

where $F_{\mathrm{sph},k}(qR_k)$ is the Rayleigh form factor for a sphere, $R_k$ the
radius of sphere $k$, and the pseudo-size distribution $p^*(R)$ is related to the
number distribution $p(R)$ through $p^*(R) = R^{p_c} p(R)$. In other words, in this
calculation (for reasons detailed in paragraph 3) the volume-squared scaling of each
sphere contribution is partially compensated for by an exponent $p_c$, so that the
scaling follows $R^{(6-p_c)}$, with $p_c$ typically in the range $2 < p_c < 4$.
Incidentally, setting $p_c = 3$ makes $p^*(R)$ identical to the volume-weighted size
distribution.

¹ The use of "dilute" means that the data should not be influenced by concentration effects.
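
The intensity calculation of equation 1 can be illustrated with a minimal Python sketch. It assumes the standard normalised amplitude of a homogeneous sphere, F(x) = 3(sin x - x cos x)/x^3, treats every sphere in the set as one unit contribution to p*(R), and leaves any constant prefactor to the scaling factor A. The function names are hypothetical and are not taken from the authors' code.

```python
import math


def sphere_form_factor(q, radius):
    """Normalised Rayleigh form-factor amplitude of a homogeneous sphere."""
    x = q * radius
    if x < 1e-3:
        # Series expansion near x = 0 avoids cancellation in sin(x) - x*cos(x).
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


def mc_intensity(q_values, radii, p_c=3.0):
    """Unscaled model intensity: sum over spheres of |F|^2 * R^(6 - p_c).

    Assumption: each sphere carries unit weight in the pseudo-size
    distribution, so p*(R_k) does not appear explicitly.
    """
    return [
        sum(sphere_form_factor(q, r) ** 2 * r ** (6.0 - p_c) for r in radii)
        for q in q_values
    ]
```

With `p_c = 3`, a single sphere of radius 2 contributes a weight of 2^3 = 8 in the limit of small q, instead of 2^6 = 64 without compensation.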

Subsequently, the MC intensity is brought in line with the measured intensity through
optimization of the scaling factor and background level using a least-squares residual
minimization procedure, minimizing the reduced chi-squared $\chi_r^2$ (Pedersen, 1997):

$$ \chi_r^2 = \frac{1}{N-M} \sum_{i=1}^{N} \left[ \frac{I_{\mathrm{meas}}(q_i) - I_{\mathrm{calc}}(q_i)}{\sigma_i} \right]^2 \tag{2} $$

where

$$ I_{\mathrm{calc}}(q_i) = A \, I_{\mathrm{MC}}(q_i) + b \tag{3} $$

and where $N$ is the number of data points, $M$ the number of degrees of freedom
(unfortunately ill-defined in an MC model, and set equal to two here for the intensity
scaling parameter $A$ and the background contribution parameter $b$),
$I_{\mathrm{meas}}$ and $I_{\mathrm{calc}}$ the measured intensity and calculated model
intensity, respectively, and $\sigma_i$ the estimated error on data point $i$ (the
estimation method for the error is detailed in paragraph 4).
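
Because the calculated intensity in equation 3 is linear in the scaling factor and the background, the two parameters can be found in closed form. The following Python sketch shows one way to do this with weighted linear least squares and then evaluates equation 2. The closed-form solution is an assumption made for illustration (the text only specifies a least-squares minimization), and the function names are hypothetical.

```python
def fit_scale_and_background(i_meas, i_mc, sigma):
    """Weighted linear least squares for A and b in I_calc = A * I_MC + b."""
    w = [1.0 / s**2 for s in sigma]
    s_w = sum(w)
    s_x = sum(wi * x for wi, x in zip(w, i_mc))
    s_y = sum(wi * y for wi, y in zip(w, i_meas))
    s_xx = sum(wi * x * x for wi, x in zip(w, i_mc))
    s_xy = sum(wi * x * y for wi, x, y in zip(w, i_mc, i_meas))
    det = s_w * s_xx - s_x**2  # zero only if I_MC is constant in q
    scale = (s_w * s_xy - s_x * s_y) / det
    background = (s_xx * s_y - s_x * s_xy) / det
    return scale, background


def reduced_chi_squared(i_meas, i_calc, sigma, n_params=2):
    """Equation 2, with M = n_params (two here: scaling and background)."""
    total = sum(((m - c) / s) ** 2 for m, c, s in zip(i_meas, i_calc, sigma))
    return total / (len(i_meas) - n_params)
```

Note that this sketch does not constrain the scaling factor or background to be positive; a constrained minimizer would be needed if that matters for a given dataset.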



As the scattering intensity of particles scales in proportion to their volume squared
(radius to the sixth power for spheres), the scattered intensity of smaller particles
in a polydisperse set is quickly drowned out by the disproportionately larger signal
of larger particles. This effect, however, is partially compensated for by the
different q-dependence of the scattering of the smaller particles.



To investigate how large this compensatory effect is, we can define the "maximum
observability" of a particle in a set as the maximum fractional contribution of that
component to the total scattering pattern. That is, the observability
$\mathrm{Obs}_{\max,i}$ for component $i$ in a scattering pattern of $N$ independent
contributions, measured within the q-range delimited by $q_{\min} < q < q_{\max}$, is
defined as:

$$ \mathrm{Obs}_{\max,i} = \max_{q_{\min} < q < q_{\max}} \frac{I_i(q)}{\sum_{j=1}^{N} I_j(q)} = \frac{I_i(q_m)}{\sum_{j=1}^{N} I_j(q_m)} \tag{4} $$



Step 2: Optimization cycle 

The Monte-Carlo optimization cycle then begins by picking a random sphere from the
set of $n_s$ spheres, changing its radius to another random value within the bounds,
recalculating the intensity of the entire set using equation 1, reoptimizing the
scaling factor and background level (equations 2 and 3), and checking whether this
radius change improves the agreement between measured and MC intensity, i.e. whether
the change reduces the $\chi_r^2$ value. If it does, the change is accepted;
otherwise it is rejected. A rejection-acceptance mechanism (occasionally accepting
"bad moves") was found not to be necessary.

This method differs from the Martelli method, in that the 
Martelli method continually attempts to add new sphere contri- 
butions to an ever growing set, leaving the prior established set 
contributions untouched. The adaptation presented here leaves 
the number of sphere contributions in the set unchanged, but 
repeatedly tries to change the radius of a random contribution 
in the set. 
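
The optimization cycle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: all names are hypothetical, the scaling and background are fitted by closed-form weighted linear least squares, the lower radius bound is taken as zero, and the total intensity is updated by swapping out the changed contribution rather than re-summing the whole set (equivalent up to rounding).

```python
import math
import random


def sphere_ff(q, r):
    """Normalised form-factor amplitude of a homogeneous sphere."""
    x = q * r
    if x < 1e-3:
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


def contribution(q_values, r, p_c):
    """Intensity contribution of one sphere, |F|^2 * R^(6 - p_c)."""
    return [sphere_ff(q, r) ** 2 * r ** (6.0 - p_c) for q in q_values]


def chi2_after_linear_fit(i_meas, i_mc, sigma):
    """Fit A and b (equation 3), then return the reduced chi-squared."""
    w = [1.0 / s**2 for s in sigma]
    sw = sum(w)
    sx = sum(a * x for a, x in zip(w, i_mc))
    sy = sum(a * y for a, y in zip(w, i_meas))
    sxx = sum(a * x * x for a, x in zip(w, i_mc))
    sxy = sum(a * x * y for a, x, y in zip(w, i_mc, i_meas))
    det = sw * sxx - sx * sx
    a_fit = (sw * sxy - sx * sy) / det
    b_fit = (sxx * sy - sx * sxy) / det
    resid = sum(((y - a_fit * x - b_fit) / s) ** 2
                for x, y, s in zip(i_mc, i_meas, sigma))
    return resid / (len(i_meas) - 2)


def mc_optimise(q_values, i_meas, sigma, r_max, n_spheres=50, p_c=3.0,
                max_steps=20000, seed=0):
    """Change one random radius per step; keep the change only if chi2 drops."""
    rng = random.Random(seed)
    radii = [rng.uniform(0.0, r_max) for _ in range(n_spheres)]
    parts = [contribution(q_values, r, p_c) for r in radii]
    total = [sum(col) for col in zip(*parts)]
    chi2 = chi2_after_linear_fit(i_meas, total, sigma)
    steps = 0
    while chi2 >= 1.0 and steps < max_steps:
        steps += 1
        k = rng.randrange(n_spheres)          # pick a random sphere
        new_r = rng.uniform(0.0, r_max)       # propose a new radius
        new_part = contribution(q_values, new_r, p_c)
        trial = [t - old + new
                 for t, old, new in zip(total, parts[k], new_part)]
        trial_chi2 = chi2_after_linear_fit(i_meas, trial, sigma)
        if trial_chi2 < chi2:                 # accept improvements only
            radii[k], parts[k] = new_r, new_part
            total, chi2 = trial, trial_chi2
    return radii, chi2, steps
```

The number of spheres stays fixed throughout, which is the difference from the Martelli method noted above. The loop stops either when the reduced chi-squared drops below one or when `max_steps` is exhausted, so the caller should check the returned value before using the radii.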

Step 3: Convergence and post-optimization procedures 

The optimization is stopped once the condition $\chi_r^2 < 1$ has been reached
(c.f. paragraph 4). If convergence has not been reached within a certain number of
steps (here set to 1 million) for a limited number of attempts, the pattern is
considered unsuitable for fitting with this method². For visualization and analysis
purposes, the set of spheres can be distributed in a histogram (weighted to compensate
for $p_c$). The whole MC procedure is then repeated several tens of times in order to
obtain information on the mean and standard deviation of the histogram points.

² It is very rare that a pattern is not described at the first attempt but can be described upon repetition, but it does happen occasionally.

3. Observability of isolated spheres in a polydisperse set

In the definition of the maximum observability (equation 4), the location $q_m$ is the
value within the observed q-range where $\mathrm{Obs}_{\max,i}$ is largest. For the MC
method, both $I_i(q_m)$ and $I_j(q_m)$ are defined by equation 1.

When plotting $\mathrm{Obs}_{\max,i}$ for any size distribution of spherical particles,
it is evident that the observability scales with the sphere radius squared (c.f.
Figure 1) for particles with sizes larger than $R_{\lim} \approx \pi/q_{\max}$.
Particles smaller than $R_{\lim}$ exhibit an observability scaling in line with the
volume-squared intensity scaling (i.e. radius to the sixth power). The observability is
shown for three unimodal distributions (one uniform and two triangular distributions of
50000 spheres) with particle radii between 0.1 < R (Å) < 350. The modes of the
triangular distributions are set to 0.1 and 350 for the "trailing" and "leading"
triangular distributions, respectively (the distributions are shown in Figure 2).
Figure 1 shows that the shape of the distribution has little visible effect on the
shape of the observability. The absolute values of the observability are slightly
dependent on the distribution, but more significantly dependent on the number of
contributions (as can be inferred from equation 4 directly, but is not graphically
shown).

Figure 1

Observability for three unimodal distributions of spheres (whose size distributions
are shown in Figure 2), within 0.01 < q < 0.35. A change in slope is observed at
$R_{\lim} \approx \pi/q_{\max}$. $p_c$ is zero in the calculation.

Figure 2

Histograms (50 bins) of the distributions used for generating Figure 1.

The information on the observability can be used for three purposes. First and
foremost, there is a clear indication of the limits of small-angle scattering for
resolving the smaller sizes, and there is a link between the smallest measurable sizes
and the maximum measured q (as the observability scales with radius to the sixth power
for particles with radii smaller than $\pi/q_{\max}$, their contribution vanishes
rapidly). Note that it does not give information about the upper particle size
detection limit, which is defined by the ability to distinguish the differences of
scattering patterns of large particles.

Secondly, the knowledge can be used to calculate, for any model fitting solution, the
minimum number of particles required to make a measurable impact on the total
scattering pattern. This is done by calculating the inverse observability for the
resulting distribution, and multiplying this by the overall measurement accuracy $v$
(usually 1%):

$$ n_{\min,i} = v \, \frac{\sum_{j=1}^{N} I_j(q_m)}{I_i(q_m)} \tag{5} $$
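
The maximum observability and the resulting minimum detectable number can be computed directly from the per-sphere intensity contributions. The Python sketch below does this for a discrete q-grid; it assumes the sphere form factor used for equation 1, takes the maximum over the supplied q-values only, and uses hypothetical function names.

```python
import math


def sphere_ff(q, r):
    """Normalised form-factor amplitude of a homogeneous sphere."""
    x = q * r
    if x < 1e-3:
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


def max_observability(q_values, radii, p_c=0.0):
    """For each sphere, the largest fraction I_i(q) / sum_j I_j(q) over the q-grid."""
    parts = [
        [sphere_ff(q, r) ** 2 * r ** (6.0 - p_c) for q in q_values]
        for r in radii
    ]
    totals = [sum(col) for col in zip(*parts)]
    return [max(p / t for p, t in zip(part, totals)) for part in parts]


def min_detectable_number(observability, accuracy=0.01):
    """Measurement accuracy v divided by the maximum observability."""
    return [accuracy / obs for obs in observability]
```

For two identical spheres each contribution is exactly half of the total at every q, so the observability is 0.5 and, at 1% accuracy, the minimum detectable number is 0.02. The sketch does not guard against a q-value at which every contribution is exactly zero.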



Note that for the MC method presented here, in order to compensate for the discrepancy
between the number of sphere contributions $n_s$ used and the number of bins $N$ they
eventually end up in, we need to calculate:



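[Equation 6: definition of $n_{\min\mathrm{MC},i}$; only the symbol $N$ is legible on the right-hand side in the source.]    (6)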



In this way, any plots of size distributions can contain a line 
indicating a rough estimate for the minimum detectable number 
of particles (c.f. Figures 5 and 4), which can prevent drawing 
erroneous conclusions from analysis artifacts. As stated above, 
the observability, and therefore these minimum required particle 
numbers, are directly dependent on the number of size divisions 
in the distribution. 

Thirdly, the disproportionate contribution of larger spheres in numerical integrations
over size distributions can be reduced by determining, instead of the size distribution
function $p(R)$ in $I_{\mathrm{MC}} \propto \int |F_{\mathrm{sph}}(qR)|^2 R^6 p(R)\,\mathrm{d}R$,
a pseudo-size distribution $p^*(R) = R^{p_c} p(R)$ as in equation 1. Upon determination
of $p^*(R)$, the correct number-weighted distribution $p(R)$ can be retrieved through
division of $p^*(R)$ by $R^{p_c}$. As indicated in paragraph 2, this compensatory power
does not have to be equal to 2, and can be tuned to make the MC minimization more
efficient.
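
The effect of the compensation can be made concrete with a short worked example. It assumes two spheres whose radii differ by a factor of ten (an illustrative choice, not a value from the measurements) and considers only the size-dependent prefactor at small $q$, where $|F_{\mathrm{sph}}|^2 \approx 1$:

```latex
\begin{align*}
\frac{R_2^{6}}{R_1^{6}} &= 10^{6} && (p_c = 0), \\
\frac{R_2^{6-p_c}}{R_1^{6-p_c}} &= 10^{3} && (p_c = 3).
\end{align*}
```

With $p_c = 3$ the weights of the individual contributions thus span three orders of magnitude instead of six.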

4. Data point weighting and convergence criterion 

Estimates of the level of uncertainty on each measured data point ("errors",
$\sigma_i$ in equation 2) are invaluable for assessing the veracity of model fitting
results (i.e. for determining whether the analysis provided a solution to within the
uncertainty estimate). Additionally, knowledge of these errors can help decouple the
model fitting result from more arbitrary parameters such as the measured intensity
integration bin width or the number of data points. By weighting the goodness-of-fit
parameter (used in the least-squares minimization function) by this error (c.f.
equation 2), uncertainties on the MC solution can be established.

These errors can be estimated to be at their very least the counting-statistics-based
Poisson error ($\sigma = \sqrt{I}$, where $I$ is the number of detected counts).
Furthermore, if this estimate is exceeded by the sample standard error of the mean of
the values contained in each individual integration bin (defined in equation 7), the
sample standard error should be the preferred error estimate for that bin (as it can
account for some detector irregularities). This sample standard error for each bin
($E_{\mathrm{bin}}$) is defined as:

$$ E_{\mathrm{bin}} = \sqrt{\frac{1}{N_p - 1} \sum_{i=1}^{N_p} \left( I_i - I_{\mathrm{bin}} \right)^2} \tag{7} $$

where $I_{\mathrm{bin}}$ is the mean intensity in the bin, $I_i$ the intensity of data
point $i$ in the bin, and $N_p$ the total number of points in the bin.

Lastly, it is commonly challenging to bring the absolute uncertainty of the measured
intensity below 1%. Thus, 1% (or however small the instrumental error is estimated to
be) of the measured intensity should be preferred if it exceeds the other two
estimates. These errors can then be used in equation 2 to determine the goodness-of-fit
parameter as the acceptance-rejection criterion for an MC proposition.
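
The selection among the three error estimates can be sketched in Python as taking the largest of them for each integration bin. The sketch assumes the bin holds raw detected counts, that the Poisson error of the bin mean is the square root of the summed counts divided by the number of points (an assumption, since the text gives only the single-value form), and that the spread term follows equation 7. The function name is hypothetical.

```python
import math


def bin_error(counts, min_fraction=0.01):
    """Return (mean intensity, error estimate) for one integration bin."""
    n = len(counts)
    mean = sum(counts) / n
    # Assumption: Poisson error propagated to the mean of the bin.
    poisson = math.sqrt(sum(counts)) / n
    # Spread of the values in the bin, following equation 7.
    if n > 1:
        spread = math.sqrt(sum((c - mean) ** 2 for c in counts) / (n - 1))
    else:
        spread = 0.0
    # Instrumental floor: a fixed fraction of the measured intensity.
    floor = min_fraction * mean
    return mean, max(poisson, spread, floor)
```

For four pixels of 100 counts each, the Poisson term (5) dominates; for four pixels of 10000 counts each, the 1% floor (100) exceeds the Poisson term (50) and is used instead.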

The advantage of using these errors in the expression for $\chi_r^2$ is that if this
parameter drops below one, the deviations between model and measured intensities are
on average smaller than the statistical uncertainties. This provides a cut-off
criterion for (for example) the MC method, allowing for the estimation of the mean and
standard deviations of the final particle size histogram (shown in Figure 4).
Additionally, by using these errors in the expression for the goodness-of-fit
parameter, the intensities are weighted by their relative errors in the fitting
procedures and thus become less sensitive to arbitrary values such as bin widths or
the number of data points used in the fit.

Since the MC method does not provide us with an intensity at 
the same level as the measured intensity, and since there often 
is a constant background associated with small-angle scatter- 
ing patterns for a variety of reasons (Ruland, 1971; Koberstein 
et al, 1980), these two parameters will have to be determined 
separately. Thus, after every MC proposed change, but before 
calculation of the goodness-of-fit, an intermediate least-squares 
minimization routine is applied to optimize the model inten- 
sity scaling and background parameters (eqn. 3). If required, 
the least-squares minimization method can be expanded to 
include more terms, at the cost of speed and stability. One 
reason for such an inclusion could be to include a power- 
law slope (with optional cut-off) to compensate for scattering 
from large structures or some inter-particle scattering effects 
(Pedersen, 1994; Beaucage, 1995; Beaucage, 1996). 

5. Uncertainties on the resulting distribution 

One common criticism of MC methods is the potential for ambiguity in the result, or
the risk of over-fitting the data. While a rough estimate for the number of
independently determinable radii (or histogram bins) can be found using the sampling
theorem (since $r_{\mathrm{res}} = \pi/q_{\max}$, $N \approx q_{\max}/q_{\min}$,
assuming the largest measurable dimension is identical to the measurement limit
(Hansen & Pedersen, 1991; Moore, 1980; Taupin & Luzzati, 1982)), this number also has
to depend on the uncertainty of the underlying dataset. An alternative, practical way
of investigating the validity of the result is to determine the errors (standard
deviation in this case) on the result.
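
As a small worked illustration of the sampling-theorem estimate, the Python sketch below computes the resolution, the largest measurable dimension and the resulting number of bins from the q-range. The function name is hypothetical, and the example uses the q-range quoted for Figure 1 rather than that of the measured data.

```python
import math


def sampling_estimate(q_min, q_max):
    """Return (r_res, r_max, n_bins) from the sampling-theorem argument."""
    r_res = math.pi / q_max   # smallest resolvable dimension
    r_max = math.pi / q_min   # largest measurable dimension (assumed limit)
    return r_res, r_max, round(r_max / r_res)


r_res, r_max, n_bins = sampling_estimate(0.01, 0.35)
```

For 0.01 < q < 0.35 this gives roughly 35 independently determinable bins, with lengths in the inverse units of q.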

These can be obtained with the present MC method by performing the MC fit to the same
data several tens of times, each time optimizing until $\chi_r^2 < 1$ has been
achieved. By distributing the results in histograms with a fixed array of bins, the
mean value and standard deviation for each bin can be determined. Naturally, the
relative standard deviation is a function of the bin size, so that more numerous,
narrower bins will have a larger standard deviation than fewer, wider bins (within
reason). Additionally, the level of the observability (equation 4) is dependent on the
number of bins, leading to a trade-off between the minimum observable number of
particles $n_{\min\mathrm{MC},i}$ and the number of bins. For the most common
equidistant histogramming, however, it is up to the user to determine the best-suited
number of size histogram bins (or to use a value close to the
sampling-theorem-derived value).
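
The repetition and histogramming step can be sketched in Python as follows. The sketch assumes that each repetition yields a list of positive sphere radii, that each sphere is weighted by R^(-p_c) to convert the pseudo-size distribution back to a number-weighted one, and that at least two repetitions are available. The function names are hypothetical.

```python
import statistics


def weighted_histogram(radii, bin_edges, p_c=3.0):
    """Number-weighted histogram: each sphere counts as R**(-p_c)."""
    counts = [0.0] * (len(bin_edges) - 1)
    for r in radii:
        for b in range(len(counts)):
            # Half-open bins [edge_b, edge_b+1); radii outside are ignored.
            if bin_edges[b] <= r < bin_edges[b + 1]:
                counts[b] += r ** (-p_c)
                break
    return counts


def bin_statistics(repetitions, bin_edges, p_c=3.0):
    """Mean and sample standard deviation of every bin over all repetitions."""
    hists = [weighted_histogram(r, bin_edges, p_c) for r in repetitions]
    means = [statistics.mean(col) for col in zip(*hists)]
    stds = [statistics.stdev(col) for col in zip(*hists)]
    return means, stds
```

Using the same `bin_edges` for every repetition is what makes the per-bin mean and standard deviation meaningful; setting `p_c=0.0` in the histogram returns the unweighted pseudo-size counts instead.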

The procedure then provides the user with a clear overview 
of the uncertainties and detection limits attached to the deter- 
mination of polydispersity from small-angle scattering patterns, 
which can be used for further extraction of meaningful numbers 
from the resulting distribution. 

6. Experimental 

6.1. Synthesis 

Boehmite (AlOOH) particles were synthesized in situ using an automated and modified
version of a high-pressure high-temperature reactor (Becker et al., 2010). The sapphire
capillary in which the reaction takes place has an inner diameter of 1.0 mm and an
outer diameter of 1.57 mm. The particles were synthesized from a solution of 0.5 M
Al(NO3)3 precursor in water. The start of the reaction was considered to be the moment
at which the pressurized solution (maintained at a pressure of 250 bar) is heated to
its reaction temperature of 275 degrees centigrade. The measurement used in this paper
was obtained 1700 seconds from the start of the reaction. Further details and results
will be presented in a forthcoming paper.

6.2. Beamline details 

Synchrotron SAXS experiments were performed at the 
BL45XU beam line of the SPring-8 synchrotron in Japan. The 
beam was collimated to a 0.4 by 0.2 mm beam (horizontal by 
vertical, respectively), with photons with a wavelength of 0.09 
nm. The sample-to-detector distance was 2.59 meter. The scat- 
tering patterns were recorded on a Pilatus 300k detector whose 
total surface area covers 33.5 by 254 mm, consisting of 195 by 
1475 pixels measuring 0.172 by 0.172 mm. Transmission values
were determined using in-line ionization chambers. The polar- 
ization factor was assumed to be 0.95. The measurements were 
collected at a rate of 1 Hz. 

6.3. Data correction 

The data were corrected for background (water at 275 
degrees centigrade and 250 bar), incoming flux, measurement 
time, transmission factor, polarization factor, spherical correc- 
tion factor and calibrated to absolute units using a Glassy car- 
bon sample from series H, supplied by Dr. Ilavsky from APS 
(Zhang et al, 2009). Statistics were calculated according to the 
procedure outlined in paragraph 4, with the minimum possible 
error set to 1% of the measured intensity. 

7. Results and discussion 

The collected dataset, its error and the MC fit are shown in Figure 3, where the MC fit
intensity is the average of 100 repetitions of the MC procedure. While a single run
also delivers a model intensity to within the determined error (on average), the mean
intensity is shown here as it matches the mean of the size distributions shown in
Figures 5 and 4. In Figure 5, the pseudo-size distribution is shown, as determined from
the MC procedure using $p_c = 3$. The error bars indicate the standard deviation of the
histogrammed values over the 100 repetitions. This figure also contains the line
indicating the minimum number of visible particles required, $n_{\min\mathrm{MC},i}$
(which here is proportional to the radius due to the choice of $p_c$).

Transformed to a number size distribution, the histogram becomes that shown in
Figure 4. It is immediately clear that the larger particles are only present in the
solution in minuscule amounts, and that there is a dip in the number of particles with
a radius around 5 nanometers, with many particles slightly smaller and larger than
that size.

In this example, the number of histogram bins is 25, but one can choose more or fewer
bins. The effect of this is shown in the pseudo-size distributions in Figures 6 and 7
for 60 and 15 histogram bins, respectively. This clearly shows the relation between the
standard deviation in the bins and the minimum observable number of particles. If the
number of histogram bins is high, the uncertainty for each value is correspondingly
large, and the minimum number of particles of each bin size required to make a
measurable impact on the scattering pattern increases. If the number of bins is
reduced, both the uncertainty and $n_{\min\mathrm{MC},i}$ are reduced, at the cost of
detail. For further (and more involved) improvement in information content retrieval,
the bins in Figure 5 that fall below the $n_{\min\mathrm{MC},i}$ line could be combined
(which would render them "observable"), and the bins above this line could be further
subdivided for improved information extraction.




Figure 3

Data (black) and MC fit (red, using 1000 spheres) for one one-second measurement in a
time series of AlOOH nanoparticles in aqueous solution. The MC fit is at convergence.

Figure 4

Size distribution P(R), standard errors and minimum observable number of particles
obtained by transforming the distribution P*(R) shown in Figure 5. Shown with limited
vertical axis range for clarity.

Figure 5

Pseudo-size distribution P*(R) with $p_c = 3$ as used for the MC fit shown in
Figure 3. Error bars indicate sample standard deviation over 100 repetitions. Minimum
observable number of spheres $n_{\min\mathrm{MC},i}$ shown as blue line.



short communications 




[Figure axis label: Sphere radius ($10^{-9}$ m)]

Figure 6

The MC-determined P*(R) of Figure 5 histogrammed in 60 bins. Minimum observable number
of spheres $n_{\min\mathrm{MC},i}$ shown as blue line.

Figure 7

The MC-determined P*(R) of Figure 5 histogrammed in 15 bins. Minimum observable number
of spheres $n_{\min\mathrm{MC},i}$ shown as blue line.



8. One more thing... 

All the above results were obtained assuming that the scatterers are spherical in
shape. This does not have to be the correct particle shape for the method to arrive at
a solution. As mentioned before, the size distribution and shape cannot be uniquely
separated from scattering patterns (which has been tested for simulated isotropic
scattering patterns from polydisperse sets of prolate and oblate ellipsoids). The
solution from the MC method, then, shows the user what the size distribution would be
if the scattering pattern originated from spherical particles.



If the shape of the scatterers is known from other investigations such as electron
microscopy, and deviates from a spherical shape, this information can be used to obtain
the correct size distribution for that particular shape (Pedersen et al., 1996). This
can be done either by adjusting the particular scattering function in the MC method, or
by analysis (or rather deconvolution) of the correlation function $\gamma(r)$, which
can be calculated from the sphere-based MC method result (Feigin & Svergun, 1987).
However, it should be kept in mind that only one length distribution can be uniquely
determined due to the limited dimensionality (information content) of the isotropic
scattering data.

9. Conclusions 

Discussed in this paper are modifications to the Martelli MC method, the general
veracity of the result, and the application of the method to a SAXS measurement. It is
shown that, by using the methodology described in this paper, a particle size
distribution can be retrieved from a scattering pattern, uncertainties can be estimated
for the particle size distribution, and the minimum number of particles required to
make a measurable impact on the scattering pattern can be indicated for each size.

The MC code is available for inspection, improvements and 
application and will be freely supplied by the author upon 
request. 